feat(ai): Foundation — Error Types + Constants [PR 1/7]#407
Conversation
Foundation layer for the AI testing framework. Introduces structured error handling via error-causes and runtime-validated configuration constants via Zod schemas. Updates eslint ecmaVersion to 2022 to support numeric separators and optional chaining used throughout the framework source. Files: - source/ai-errors.js — named error types (ParseError, ValidationError, etc.) - source/ai-errors.test.js — full coverage for error descriptors and createError - source/constants.js — defaults, constraints, and Zod schemas - source/constants.test.js — 26 tests covering all schemas and boundaries - eslint.config.js — bump ecmaVersion 2017 → 2022 (prerequisite) - package.json — add error-causes and zod to production dependencies Co-authored-by: Cursor <cursoragent@cursor.com>
Adds vitest.config.js e2e exclusion (source/e2e.test.js uses Riteway/Tape, not Vitest) alongside the eslint ecmaVersion 2022 bump already in place. Both changes are sourced from the working feature branch. Co-authored-by: Cursor <cursoragent@cursor.com>
- constants.js: lazy process.cwd() default (z.string().default(() => process.cwd())) prevents stale value when cwd changes after module load - constants.js: add concurrencyMax (50) to constraints + enforce in concurrencySchema - constants.js: remove JSDoc from internal constants (not public API) - constants.test.js: add full aiTestOptionsSchema coverage (valid input, missing filePath, empty filePath, invalid agent, lazy cwd default, optional agentConfigPath) - constants.test.js: add concurrencySchema upper-bound tests - ai-errors.test.js: replace for..of loops with test.each (one named test per case) - ai-errors.test.js: expand createError integration to cover two error types - ai-errors.test.js: replace typeof handleAIErrors check with behavioral routing tests - ai-errors.js: remove forward-reference comment (extraction-parser.js not yet in scope) - eslint.config.js: Object.assign -> spread operator Co-authored-by: Cursor <cursoragent@cursor.com>
|
@cursor please /review |
🔬 Code Review✅ Verification StatusAll checks pass successfully:
Code Quality Assessmentsource/ai-errors.js ✅Strengths:
Observations:
source/ai-errors.test.js ✅Strengths:
Observation: source/constants.js ✅Strengths:
Observations:
source/constants.test.js ✅Strengths:
Observations:
eslint.config.js ✅Changes:
Observation: vitest.config.js ✅Changes:
Observation: package.json ✅Changes:
Observation: Architecture & Design Decisions1. Error Handling Strategy ⭐
2. Validation Strategy ⭐
3. Test Design ⭐
Adherence to Project Standardsjavascript.mdc compliance: ✅
tdd.mdc compliance: ✅
Naming conventions: ✅
Security & PerformanceSecurity:
Performance:
RecommendationsNone. This PR represents excellent engineering discipline:
The established conventions ( SummaryStatus: ✅ APPROVED This is a well-crafted foundation PR. The error infrastructure and validation schemas are solid, tests are exemplary, and the correctness fixes (lazy |
eb0fa7b
into
ai-testing-framework-implementation-consolidation
|
|
||
| describe('constants module', () => { | ||
| describe('defaults', () => { | ||
| test('exports standard test configuration', () => { |
There was a problem hiding this comment.
I don't know if it makes sense to test constants. CC: @ericelliott
Tests should only cover behavior, shouldn't they? So for example here, if there is a default of 4 runs, we should test that in the function by invoking it without arguments, don't we?
There was a problem hiding this comment.
Yes, agreed. Test behaviors, not values, not types.
There was a problem hiding this comment.
Agreed - I admit I merged this too hastily and manually reviewed after the fact and saw this as well. Will clean up in a PR to the consolidated branch referencing these comments accordingly.
There was a problem hiding this comment.
There was a problem hiding this comment.
Similar comment to this testing file. We're kind just testing the error causes API.
If you want to test these at all, you should probably test them as pairs of getters and setters:
test('ParseError round-trips through handleAIErrors', () => {
const err = createError(ParseError);
let result;
handleAIErrors({
...allNoop,
ParseError: (cause) => { result = cause; }
})(err);
assert({
given: 'an error created from ParseError',
should: 'route to the correct handler with the right cause name',
actual: result.name,
expected: 'ParseError'
});
});The createError is the setter, handleAIErrors is the getter. That's the only contract worth testing — that errors you create actually route to the right handler with the right cause. The descriptor shape tests (name, code, named export equality) are just testing that errorCauses returns what you passed in.
There was a problem hiding this comment.
This is just testing error causes. Don't do that. Do test that code that uses these throws the correct errors, and that functions handle them correctly. In other words, test how your consuming code USES errors, instead of testing the already tested error-causes API
There was a problem hiding this comment.
- test(ai-errors): remove error-causes API tests; keep only handleAIErrors behavioral routing (ericelliott/janhesters #407) - test(constants): remove defaults/constraints value-only blocks; replace tautological expected: defaults.X with literals (ericelliott #407) - fix(debug-logger): rename writeToFile→bufferEntry, process→logProcess export; add logFile type guard; circular ref safety in formatMessage; command() rest params; improved JSDoc (janhesters #408) - test(debug-logger): onTestFinished for all teardown; add circular ref and logFile TypeError tests; flush no-op debug:false (janhesters #408) - fix(limit-concurrency): guard non-positive limit with RangeError; onTestFinished for fake timer teardown; document fail-fast (janhesters #408) - test(agent-parser): replace partial assertions with full expected values including ndjsonLength (janhesters #409) - test(extraction-parser): replace 4x multi-assert blocks with single full-object assertions (janhesters #409) Co-authored-by: Cursor <cursoragent@cursor.com>
- test(ai-errors): remove error-causes API tests; keep only handleAIErrors behavioral routing (ericelliott/janhesters #407) - test(constants): remove defaults/constraints value-only blocks; replace tautological expected: defaults.X with literals (ericelliott #407) - fix(debug-logger): rename writeToFile→bufferEntry, process→logProcess export; add logFile type guard; circular ref safety in formatMessage; command() rest params; improved JSDoc (janhesters #408) - test(debug-logger): onTestFinished for all teardown; add circular ref and logFile TypeError tests; flush no-op debug:false (janhesters #408) - fix(limit-concurrency): guard non-positive limit with RangeError; onTestFinished for fake timer teardown; document fail-fast (janhesters #408) - test(agent-parser): replace partial assertions with full expected values including ndjsonLength (janhesters #409) - test(extraction-parser): replace 4x multi-assert blocks with single full-object assertions (janhesters #409) Co-authored-by: Cursor <cursoragent@cursor.com> * test(agent-parser): use full expected values - Replace JSON.stringify comparisons with direct object assertions - Collapse 4 partial error.cause assertions into single full-object assert in parseOpenCodeNDJSON error test - Expand partial error.cause?.name assertion to full cause object in unwrapAgentResult error test Addresses Jan's PR #409 comment: deterministic functions should assert the complete expected value, not individual properties. Co-authored-by: Cursor <cursoragent@cursor.com> * test(ai): replace partial assertions with full expected values Per Jan's review: deterministic functions should assert the complete expected value, not individual properties. - extraction-parser: collapse ExtractionParseError and ExtractionValidationError cause assertions to full objects; comment .name usage (SyntaxError sets it as own property) - tap-yaml: consolidate per-property result asserts to full objects; collapse error cause to single full-object assert; remove redundant typeof score check - execute-agent: collapse AgentProcessError, TimeoutError, ParseError cause assertions to full objects; comment 3-deep .cause chain - aggregation: collapse ValidationError and ParseError test.each cause assertions to full objects; comment .constructor.name (ZodError does not set .name as own property); remove standalone normalizeJudgment ParseError test made redundant by test.each Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(ai): add error types, constants, and Zod schemas (PR 1/7) Foundation layer for the AI testing framework. Introduces structured error handling via error-causes and runtime-validated configuration constants via Zod schemas. Updates eslint ecmaVersion to 2022 to support numeric separators and optional chaining used throughout the framework source. Files: - source/ai-errors.js — named error types (ParseError, ValidationError, etc.) - source/ai-errors.test.js — full coverage for error descriptors and createError - source/constants.js — defaults, constraints, and Zod schemas - source/constants.test.js — 26 tests covering all schemas and boundaries - eslint.config.js — bump ecmaVersion 2017 → 2022 (prerequisite) - package.json — add error-causes and zod to production dependencies Co-authored-by: Cursor <cursoragent@cursor.com> * chore(config): bring working configs from feature branch Adds vitest.config.js e2e exclusion (source/e2e.test.js uses Riteway/Tape, not Vitest) alongside the eslint ecmaVersion 2022 bump already in place. Both changes are sourced from the working feature branch. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(ai): address PR 1 review findings - constants.js: lazy process.cwd() default (z.string().default(() => process.cwd())) prevents stale value when cwd changes after module load - constants.js: add concurrencyMax (50) to constraints + enforce in concurrencySchema - constants.js: remove JSDoc from internal constants (not public API) - constants.test.js: add full aiTestOptionsSchema coverage (valid input, missing filePath, empty filePath, invalid agent, lazy cwd default, optional agentConfigPath) - constants.test.js: add concurrencySchema upper-bound tests - ai-errors.test.js: replace for..of loops with test.each (one named test per case) - ai-errors.test.js: expand createError integration to cover two error types - ai-errors.test.js: replace typeof handleAIErrors check with behavioral routing tests - ai-errors.js: remove forward-reference comment (extraction-parser.js not yet in scope) - eslint.config.js: Object.assign -> spread operator Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
- test(ai-errors): remove error-causes API tests; keep only handleAIErrors behavioral routing (ericelliott/janhesters #407) - test(constants): remove defaults/constraints value-only blocks; replace tautological expected: defaults.X with literals (ericelliott #407) - fix(debug-logger): rename writeToFile→bufferEntry, process→logProcess export; add logFile type guard; circular ref safety in formatMessage; command() rest params; improved JSDoc (janhesters #408) - test(debug-logger): onTestFinished for all teardown; add circular ref and logFile TypeError tests; flush no-op debug:false (janhesters #408) - fix(limit-concurrency): guard non-positive limit with RangeError; onTestFinished for fake timer teardown; document fail-fast (janhesters #408) - test(agent-parser): replace partial assertions with full expected values including ndjsonLength (janhesters #409) - test(extraction-parser): replace 4x multi-assert blocks with single full-object assertions (janhesters #409) Co-authored-by: Cursor <cursoragent@cursor.com> * test(agent-parser): use full expected values - Replace JSON.stringify comparisons with direct object assertions - Collapse 4 partial error.cause assertions into single full-object assert in parseOpenCodeNDJSON error test - Expand partial error.cause?.name assertion to full cause object in unwrapAgentResult error test Addresses Jan's PR #409 comment: deterministic functions should assert the complete expected value, not individual properties. Co-authored-by: Cursor <cursoragent@cursor.com> * test(ai): replace partial assertions with full expected values Per Jan's review: deterministic functions should assert the complete expected value, not individual properties. - extraction-parser: collapse ExtractionParseError and ExtractionValidationError cause assertions to full objects; comment .name usage (SyntaxError sets it as own property) - tap-yaml: consolidate per-property result asserts to full objects; collapse error cause to single full-object assert; remove redundant typeof score check - execute-agent: collapse AgentProcessError, TimeoutError, ParseError cause assertions to full objects; comment 3-deep .cause chain - aggregation: collapse ValidationError and ParseError test.each cause assertions to full objects; comment .constructor.name (ZodError does not set .name as own property); remove standalone normalizeJudgment ParseError test made redundant by test.each Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(ai): add error types, constants, and Zod schemas (PR 1/7) Foundation layer for the AI testing framework. Introduces structured error handling via error-causes and runtime-validated configuration constants via Zod schemas. Updates eslint ecmaVersion to 2022 to support numeric separators and optional chaining used throughout the framework source. Files: - source/ai-errors.js — named error types (ParseError, ValidationError, etc.) - source/ai-errors.test.js — full coverage for error descriptors and createError - source/constants.js — defaults, constraints, and Zod schemas - source/constants.test.js — 26 tests covering all schemas and boundaries - eslint.config.js — bump ecmaVersion 2017 → 2022 (prerequisite) - package.json — add error-causes and zod to production dependencies Co-authored-by: Cursor <cursoragent@cursor.com> * chore(config): bring working configs from feature branch Adds vitest.config.js e2e exclusion (source/e2e.test.js uses Riteway/Tape, not Vitest) alongside the eslint ecmaVersion 2022 bump already in place. Both changes are sourced from the working feature branch. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(ai): address PR 1 review findings - constants.js: lazy process.cwd() default (z.string().default(() => process.cwd())) prevents stale value when cwd changes after module load - constants.js: add concurrencyMax (50) to constraints + enforce in concurrencySchema - constants.js: remove JSDoc from internal constants (not public API) - constants.test.js: add full aiTestOptionsSchema coverage (valid input, missing filePath, empty filePath, invalid agent, lazy cwd default, optional agentConfigPath) - constants.test.js: add concurrencySchema upper-bound tests - ai-errors.test.js: replace for..of loops with test.each (one named test per case) - ai-errors.test.js: expand createError integration to cover two error types - ai-errors.test.js: replace typeof handleAIErrors check with behavioral routing tests - ai-errors.js: remove forward-reference comment (extraction-parser.js not yet in scope) - eslint.config.js: Object.assign -> spread operator Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
- test(ai-errors): remove error-causes API tests; keep only handleAIErrors behavioral routing (ericelliott/janhesters #407) - test(constants): remove defaults/constraints value-only blocks; replace tautological expected: defaults.X with literals (ericelliott #407) - fix(debug-logger): rename writeToFile→bufferEntry, process→logProcess export; add logFile type guard; circular ref safety in formatMessage; command() rest params; improved JSDoc (janhesters #408) - test(debug-logger): onTestFinished for all teardown; add circular ref and logFile TypeError tests; flush no-op debug:false (janhesters #408) - fix(limit-concurrency): guard non-positive limit with RangeError; onTestFinished for fake timer teardown; document fail-fast (janhesters #408) - test(agent-parser): replace partial assertions with full expected values including ndjsonLength (janhesters #409) - test(extraction-parser): replace 4x multi-assert blocks with single full-object assertions (janhesters #409) Co-authored-by: Cursor <cursoragent@cursor.com> * test(agent-parser): use full expected values - Replace JSON.stringify comparisons with direct object assertions - Collapse 4 partial error.cause assertions into single full-object assert in parseOpenCodeNDJSON error test - Expand partial error.cause?.name assertion to full cause object in unwrapAgentResult error test Addresses Jan's PR #409 comment: deterministic functions should assert the complete expected value, not individual properties. Co-authored-by: Cursor <cursoragent@cursor.com> * test(ai): replace partial assertions with full expected values Per Jan's review: deterministic functions should assert the complete expected value, not individual properties. - extraction-parser: collapse ExtractionParseError and ExtractionValidationError cause assertions to full objects; comment .name usage (SyntaxError sets it as own property) - tap-yaml: consolidate per-property result asserts to full objects; collapse error cause to single full-object assert; remove redundant typeof score check - execute-agent: collapse AgentProcessError, TimeoutError, ParseError cause assertions to full objects; comment 3-deep .cause chain - aggregation: collapse ValidationError and ParseError test.each cause assertions to full objects; comment .constructor.name (ZodError does not set .name as own property); remove standalone normalizeJudgment ParseError test made redundant by test.each Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
* feat(ai): add error types, constants, and Zod schemas (PR 1/7) Foundation layer for the AI testing framework. Introduces structured error handling via error-causes and runtime-validated configuration constants via Zod schemas. Updates eslint ecmaVersion to 2022 to support numeric separators and optional chaining used throughout the framework source. Files: - source/ai-errors.js — named error types (ParseError, ValidationError, etc.) - source/ai-errors.test.js — full coverage for error descriptors and createError - source/constants.js — defaults, constraints, and Zod schemas - source/constants.test.js — 26 tests covering all schemas and boundaries - eslint.config.js — bump ecmaVersion 2017 → 2022 (prerequisite) - package.json — add error-causes and zod to production dependencies Co-authored-by: Cursor <cursoragent@cursor.com> * chore(config): bring working configs from feature branch Adds vitest.config.js e2e exclusion (source/e2e.test.js uses Riteway/Tape, not Vitest) alongside the eslint ecmaVersion 2022 bump already in place. Both changes are sourced from the working feature branch. Co-authored-by: Cursor <cursoragent@cursor.com> * fix(ai): address PR 1 review findings - constants.js: lazy process.cwd() default (z.string().default(() => process.cwd())) prevents stale value when cwd changes after module load - constants.js: add concurrencyMax (50) to constraints + enforce in concurrencySchema - constants.js: remove JSDoc from internal constants (not public API) - constants.test.js: add full aiTestOptionsSchema coverage (valid input, missing filePath, empty filePath, invalid agent, lazy cwd default, optional agentConfigPath) - constants.test.js: add concurrencySchema upper-bound tests - ai-errors.test.js: replace for..of loops with test.each (one named test per case) - ai-errors.test.js: expand createError integration to cover two error types - ai-errors.test.js: replace typeof handleAIErrors check with behavioral routing tests - ai-errors.js: remove forward-reference comment (extraction-parser.js not yet in scope) - eslint.config.js: Object.assign -> spread operator Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
- test(ai-errors): remove error-causes API tests; keep only handleAIErrors behavioral routing (ericelliott/janhesters #407) - test(constants): remove defaults/constraints value-only blocks; replace tautological expected: defaults.X with literals (ericelliott #407) - fix(debug-logger): rename writeToFile→bufferEntry, process→logProcess export; add logFile type guard; circular ref safety in formatMessage; command() rest params; improved JSDoc (janhesters #408) - test(debug-logger): onTestFinished for all teardown; add circular ref and logFile TypeError tests; flush no-op debug:false (janhesters #408) - fix(limit-concurrency): guard non-positive limit with RangeError; onTestFinished for fake timer teardown; document fail-fast (janhesters #408) - test(agent-parser): replace partial assertions with full expected values including ndjsonLength (janhesters #409) - test(extraction-parser): replace 4x multi-assert blocks with single full-object assertions (janhesters #409) Co-authored-by: Cursor <cursoragent@cursor.com> * test(agent-parser): use full expected values - Replace JSON.stringify comparisons with direct object assertions - Collapse 4 partial error.cause assertions into single full-object assert in parseOpenCodeNDJSON error test - Expand partial error.cause?.name assertion to full cause object in unwrapAgentResult error test Addresses Jan's PR #409 comment: deterministic functions should assert the complete expected value, not individual properties. Co-authored-by: Cursor <cursoragent@cursor.com> * test(ai): replace partial assertions with full expected values Per Jan's review: deterministic functions should assert the complete expected value, not individual properties. - extraction-parser: collapse ExtractionParseError and ExtractionValidationError cause assertions to full objects; comment .name usage (SyntaxError sets it as own property) - tap-yaml: consolidate per-property result asserts to full objects; collapse error cause to single full-object assert; remove redundant typeof score check - execute-agent: collapse AgentProcessError, TimeoutError, ParseError cause assertions to full objects; comment 3-deep .cause chain - aggregation: collapse ValidationError and ParseError test.each cause assertions to full objects; comment .constructor.name (ZodError does not set .name as own property); remove standalone normalizeJudgment ParseError test made redundant by test.each Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Cursor <cursoragent@cursor.com>
Remove describe('defaults') block which tested constant values
rather than behavior (ericelliott/janhesters pattern from PR #407).
Replace tautological defaults.X references in parseAIArgs defaults
test with literal values so the test actually verifies the defaults
are correct rather than just asserting X equals X.
Made-with: Cursor
Add `riteway ai <file>` CLI command for evaluating AI agent prompts against a configurable pass-rate threshold. Write tests as `.sudo` files, run them through any supported agent, and get TAP-formatted results with per-assertion pass rates across multiple runs. Features: - Two-agent evaluation loop: extraction agent parses the test file; result + judge agents score each assertion across N runs - `riteway ai <file> [--runs N] [--threshold P] [--timeout MS] [--agent NAME | --agent-config FILE] [--concurrency N] [--color]` - Declarative `outputFormat: 'json' | 'ndjson'` in agent configs (replaces non-serializable `parseOutput: fn`); configs fully serializable for custom JSON files - `riteway ai init [--force]` writes all built-in agent configs to `riteway.agent-config.json` for team customization - Three-level agent resolution: `--agent-config` > project registry > built-in defaults; custom registry agents pass `--agent <name>` - TAP markdown output written to `ai-evals/` with timestamped filenames - E2E test suite (`npm run test:e2e`) covering full workflow, validation error paths, and SudoLang userPrompt handling - Supports Claude, OpenCode, and Cursor agents via OAuth (no API keys) Delivered across 8 focused PRs (#407–#423) into this consolidation branch. 211 tests (14 files), 0 lint errors. Closes #394.


Context
Part of the consolidation effort for draft PR #394. Per Eric's direction, the 80+ commit / 104-file feature branch is being decomposed into 7 small, focused PRs targeting
ai-testing-framework-implementation-consolidation, which will eventually land as a single clean PR againstmaster.Consolidation order: 1 → 2 → 3 → 4 → 5 → 6 → 7 → master
What's in this PR
PR 1 of 7: Foundation — the leaf modules everything else depends on.
New source files
source/ai-errors.jserror-causes:ParseError,ValidationError,SecurityError,TimeoutError,AgentProcessError,AITestError,OutputError,ExtractionParseError,ExtractionValidationErrorsource/ai-errors.test.jscreateErrorintegration (2 types), andhandleAIErrorsrouting (exhaustive handler map pattern)source/constants.jsdefaults,constraints, and Zod schemas for all AI runner config options:runsSchema,thresholdSchema,concurrencySchema(withconcurrencyMax: 50),timeoutSchema,agentSchema,calculateRequiredPassesSchema,aiTestOptionsSchemasource/constants.test.jsaiTestOptionsSchemacoverage including lazycwddefaultConfig changes
eslint.config.jsecmaVersion: 2017 → 2022(prerequisite for numeric separators + optional chaining used throughout the framework);Object.assign→ spreadvitest.config.jssource/e2e.test.js(uses Riteway/TAP, not Vitest — sourced from feature branch)package.jsonerror-causesmoved fromdevDependencies→dependencies(runtime CLI dep);zodadded todependenciesEpic requirements addressed
From the original epic:
error-causespattern) — every downstream module usescreateErrorfrom these descriptorsaiTestOptionsSchemais the primary validation boundary forriteway ai <file>WIP fixes from the consolidation plan
No WIP fixes were scoped to this module. All issues in the scratch pad apply to later PRs.
Key correctness decisions made during consolidation
Lazy
process.cwd()default — the original feature branch had:Fixed to:
Prevents stale value if cwd changes after module import (common in test environments).
concurrencyMax: 50—concurrencySchemaon the feature branch had no upper bound, which would acceptconcurrency: 10000and exhaust file descriptors. AddedconcurrencyMax: 50toconstraintsand enforced in the schema.test.eachconvention established — feature branch usedfor...ofloops inside single test blocks. Replaced with Vitesttest.eachthroughout so each case gets its own named test entry. This convention is documented in the consolidation plan for PRs 2–7.handleAIErrorsis exhaustive — requires all registered error types to have handlers or throwsMissingHandler. Tests use anallNoopspread pattern to satisfy this while testing routing of a single type.Verification
What's next
Made with Cursor